A Regression-Based Temporal Pattern Mining Scheme for Data Streams
Authors
Abstract
In this paper we devise a regression-based algorithm, FTP-DS (Frequent Temporal Patterns of Data Streams), to mine frequent temporal patterns in data streams. While providing a general framework for pattern frequency counting, FTP-DS has two major features: a single data scan for online statistics collection, and a regression-based compact pattern representation. To achieve the single scan, data segmentation and pattern growth scenarios are explored for frequency counting; FTP-DS scans online transaction flows and generates candidate frequent patterns in real time. For the second feature, to meet the space constraint, we devise a compact ATF (Accumulated Time and Frequency) form for pattern representation, which aggregates all the information required for regression analysis. In addition, we develop segmentation tuning and segment relaxation techniques to enhance the functions of FTP-DS. With these features, FTP-DS can not only conduct mining with variable time intervals but also perform trend detection effectively. Synthetic data and a real dataset containing network alarm logs from a major telecommunication company are used to verify the feasibility of FTP-DS.

Proceedings of the 29th VLDB Conference, Berlin, Germany, 2003
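The idea of keeping only accumulated quantities that suffice for regression can be illustrated with a minimal sketch. The abstract does not specify the fields of the ATF form, so the class below is a hypothetical stand-in and not the paper's definition: it keeps running sums over (time, frequency) observations and fits a least-squares line from them without storing the individual points. The names `RegressionSummary`, `add`, `merge`, and `fit` are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RegressionSummary:
    """Hypothetical compact summary (NOT the paper's exact ATF form).

    Stores only running sums, which are sufficient to fit the
    least-squares line f = intercept + slope * t.
    """
    n: int = 0
    sum_t: float = 0.0
    sum_f: float = 0.0
    sum_tf: float = 0.0
    sum_tt: float = 0.0

    def add(self, t: float, f: float) -> None:
        """Fold one (time, frequency) observation into the sums."""
        self.n += 1
        self.sum_t += t
        self.sum_f += f
        self.sum_tf += t * f
        self.sum_tt += t * t

    def merge(self, other: "RegressionSummary") -> "RegressionSummary":
        """Combine two summaries, e.g. when adjacent segments are joined."""
        return RegressionSummary(
            self.n + other.n,
            self.sum_t + other.sum_t,
            self.sum_f + other.sum_f,
            self.sum_tf + other.sum_tf,
            self.sum_tt + other.sum_tt,
        )

    def fit(self) -> Tuple[float, float]:
        """Return (intercept, slope) of the least-squares line."""
        denom = self.n * self.sum_tt - self.sum_t ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct time points")
        slope = (self.n * self.sum_tf - self.sum_t * self.sum_f) / denom
        intercept = (self.sum_f - slope * self.sum_t) / self.n
        return intercept, slope


# Usage: frequencies 3, 5, 7 observed at times 1, 2, 3 lie on f = 1 + 2t.
summary = RegressionSummary()
for t, f in [(1, 3), (2, 5), (3, 7)]:
    summary.add(t, f)
intercept, slope = summary.fit()
```

Because the sums are additive, two summaries can be merged in constant time, which is one plausible reason an accumulated form suits variable time intervals; a positive or negative slope gives a simple trend indicator.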
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that appear frequently in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Predicting Sequential Pattern Changes in Data Streams
Data streams are utilized in an increasing number of real-time information technology applications. Unlike traditional datasets, data streams are temporally ordered, fast changing and massive. Due to their tremendous volume, performing multiple scans of the entire data stream is impractical. Thus, traditional sequential pattern mining algorithms cannot be applied. Accordingly, the present study...
A survey of temporal data mining
Data mining is concerned with analysing large volumes of (often unstructured) data to automatically discover interesting regularities or relationships which in turn lead to better understanding of the underlying processes. The field of temporal data mining is concerned with such analysis in the case of ordered data streams with temporal interdependencies. Over the last decade many interesting t...
Mining State Dependencies Between Multiple Sensor Data Sources
Pattern mining over data streams is critical to a variety of applications such as prediction and evolution of weather phenomena or anomaly detection in security applications. Most of the current techniques attempt to discover associations between events appearing on the same data stream but are not able to discover associations over multiple heterogeneous data streams. In this work, we aim to i...
Protection Scheme of Power Transformer Based on Time–Frequency Analysis and KSIR-SSVM
The aim of this paper is to extend a hybrid protection plan for Power Transformer (PT) based on MRA-KSIR-SSVM. This paper offers a new scheme for protection of power transformers to distinguish internal faults from inrush currents. Some significant characteristics of differential currents in the real PT operating circumstances are extracted. In this paper, Multi Resolution Analysis (MRA) is use...
Publication date: 2003